Modelling sub-phone insertions and deletions in continuous speech recognition

Authors

Thomas Hain

Philip C. Woodland

Abstract

Recently, an extension to standard hidden Markov models for speech recognition called Hidden Model Sequence (HMS) modelling was introduced. In this approach the relationship between the phones used in a pronunciation dictionary and the HMMs used to model them in context is assumed to be stochastic. One important feature of the HMS framework is its ability to handle arbitrary model-to-phone sequence alignments. In this paper we try to exploit that capability by using two different methods to model sub-phone insertions and deletions. Experiments on the Resource Management (RM) corpus and a subset of the Switchboard corpus show that, compared with a standard HMM baseline, a word error rate (WER) reduction of 24.3% relative can be obtained on RM and 2.4% absolute on Switchboard.

Similar resources

Pronunciation Variation Modelling in a Model of Human Word Recognition

Due to pronunciation variation, many insertions and deletions of phones occur in spontaneous speech. The psycholinguistic model of human speech recognition Shortlist is not well able to deal with phone insertions and deletions and is therefore not well suited for dealing with real-life input. The research presented in this paper explains how Shortlist can benefit from pronunciation variation mo...

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Hidden Model Sequence Models for Automatic Speech Recognition

Most modern automatic speech recognition systems make use of acoustic models based on hidden Markov models. To obtain reasonable recognition performance within a large vocabulary framework, the acoustic models usually include a pronunciation model, together with complex parameter tying schemes. In many cases the pronunciation model operates on a phoneme level and is derived independently of the...

Phonetic Modelling in the Philips Chinese Continuous Speech Recognition System

We have extended the Philips large vocabulary continuous speech recognition system towards Chinese. On the way from our existing Western language technology to Mandarin, the first step was to build a suitable phonetic model. This paper describes the development of our phonetic model, excluding tones, for Mandarin Chinese. We will present a systematic comparison of three forms of sub-syllabic units for ...

Publication date: 2000
